fix: migrate Langfuse integration from start_generation to start_obse…#14205

Merged
wangq8 merged 3 commits into infiniflow:main from RazmikGevorgyan:fix/langfuse-v4-compat
Apr 24, 2026

@RazmikGevorgyan
Contributor

@RazmikGevorgyan RazmikGevorgyan commented Apr 17, 2026

…rvation

The Langfuse Python SDK v3+ removed the start_generation() method. RagFlow's code still called this non-existent method, causing an AttributeError whenever Langfuse tracing was enabled.

Replace all start_generation() calls with start_observation(as_type="generation"), which is the correct v4 SDK API.

Affected files:

  • api/db/services/llm_service.py (12 occurrences)
  • api/db/services/dialog_service.py (1 occurrence)

Fixes #14204
Related to #9243
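
The replacement described above can be sketched as follows. This is a minimal illustration rather than the PR's actual code: `FakeTracer`, `FakeObservation`, and `start_generation_observation` are hypothetical stand-ins so the snippet runs without the Langfuse SDK installed, and the model name is a placeholder. Only the call shape (`start_observation(as_type="generation", ...)` followed by `update()` and `end()`) comes from this PR.

```python
from typing import Any


class FakeObservation:
    """Hypothetical stand-in for the object a tracer returns."""

    def __init__(self, **fields: Any) -> None:
        self.fields = fields
        self.ended = False

    def update(self, **fields: Any) -> None:
        # Merge new fields (for example the output) into the observation.
        self.fields.update(fields)

    def end(self) -> None:
        self.ended = True


class FakeTracer:
    """Hypothetical stand-in for a client that exposes start_observation()."""

    def start_observation(self, **kwargs: Any) -> FakeObservation:
        return FakeObservation(**kwargs)


def start_generation_observation(tracer: Any, **kwargs: Any) -> Any:
    """Open a generation using the call shape this PR migrates to."""
    return tracer.start_observation(as_type="generation", **kwargs)


tracer = FakeTracer()
generation = start_generation_observation(
    tracer,
    trace_context={"trace_id": "0" * 32},  # placeholder trace id
    name="chat",
    model="example-model",  # placeholder model name
    input={"messages": []},
)
generation.update(output="answer text")
generation.end()
```

Routing every call site through one small helper like this would keep the `as_type` argument in a single place, though the PR itself edits each call site directly.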

What problem does this PR solve?

Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR.

Type of change

  • Bug Fix (non-breaking change which fixes an issue)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. 🐞 bug Something isn't working, pull request that fix bug. labels Apr 17, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Apr 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f7796d1-18b4-4289-b7d1-fb7a1a1cc24f

📥 Commits

Reviewing files that changed from the base of the PR and between fd30d00 and f132aa2.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • pyproject.toml
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml

📝 Walkthrough

Replaced Langfuse callsites that used start_generation(...) with start_observation(..., as_type="generation", ...) in api/db/services/llm_service.py and api/db/services/dialog_service.py. Also bumped langfuse version requirement in pyproject.toml from >=2.60.0 to >=4.0.1. Trace context, names, inputs, and subsequent update()/end() usage are preserved.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dialog service: api/db/services/dialog_service.py | Replaced start_generation(...) with start_observation(..., as_type="generation", ...) in async_chat; preserved trace_context, name="chat", input payload, and downstream generation.update() / generation.end() usage. |
| LLM service: api/db/services/llm_service.py | Replaced multiple start_generation(...) calls with start_observation(..., as_type="generation", ...) across LLMBundle methods (embeddings, vision, speech, TTS, async chat and streaming variants); preserved trace_context, span names, input/metadata, and subsequent update()/end() calls; added docstrings to touched public methods. |
| Project config: pyproject.toml | Bumped langfuse dependency requirement from >=2.60.0 to >=4.0.1. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through lines both short and long,
Replaced a call that simply didn't belong.
Observations now hum where generations lay,
Traces kept tidy as I bounded away.
🥕✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | The PR includes the langfuse dependency version upgrade (2.60.0→4.0.1) and docstring additions, which are not required by the linked issues. | Justify the dependency upgrade and docstring additions as part of this PR scope, or move them to separate PRs if they are independent refactoring changes. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 73.33%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: migrating Langfuse integration from start_generation to start_observation API. |
| Description check | ✅ Passed | The PR description covers the problem (removed start_generation method in Langfuse v3+), the solution (replace with start_observation), affected files, and references linked issues. |
| Linked Issues check | ✅ Passed | The code changes directly address all primary objectives from issue #14204: replacing start_generation calls with start_observation(as_type="generation") across both affected files. |

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Apr 18, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/db/services/dialog_service.py`:
- Around line 513-514: The trace_id is generated with str(uuid.uuid4()) which
yields a hyphenated 36-char UUID and is incompatible with Langfuse; replace that
generation so trace_context["trace_id"] is a 32-character lowercase hex string
(use langfuse.create_trace_id() if available or uuid.uuid4().hex) where trace_id
is set and trace_context is built (references: trace_id, trace_context in this
module).
- Around line 779-783: The cleanup guard should not rely on
"langfuse_generation" appearing in locals(); initialize langfuse_generation =
None before the if langfuse_tracer block (where you call
langfuse_tracer.start_observation) and then in the finalization/cleanup use a
direct test like "if langfuse_generation is not None:" (or truthiness) to decide
whether to call finish/close on langfuse_generation; this ensures references to
langfuse_generation (created by start_observation) are safe in the nested
cleanup code.
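
The first finding above turns on the difference between the two string forms of a UUID. A minimal sketch using only the standard library follows; `new_trace_id` and `is_valid_trace_id` are hypothetical helper names, and the 32-character lowercase hex requirement is taken from the review comment rather than verified against the SDK here.

```python
import re
import uuid

# 32 lowercase hexadecimal characters, as the review comment describes.
TRACE_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def new_trace_id() -> str:
    """Return a UUID4 rendered as 32 hex characters without hyphens."""
    return uuid.uuid4().hex


def is_valid_trace_id(value: str) -> bool:
    """Check that a candidate id has the expected 32-char hex shape."""
    return TRACE_ID_PATTERN.fullmatch(value) is not None


hyphenated = str(uuid.uuid4())  # 36 characters, contains hyphens
compact = new_trace_id()        # 32 characters, hex only
```

Validating the id where `trace_context` is built would surface a malformed value early instead of at tracing time.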
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ad5f3804-5f7f-4f9c-9a43-893334ccadcb

📥 Commits

Reviewing files that changed from the base of the PR and between 57f2c6f and 5b75616.

📒 Files selected for processing (2)
  • api/db/services/dialog_service.py
  • api/db/services/llm_service.py
✅ Files skipped from review due to trivial changes (1)
  • api/db/services/llm_service.py

Comment thread api/db/services/dialog_service.py Outdated
Comment on lines 779 to 783
     if langfuse_tracer:
-        langfuse_generation = langfuse_tracer.start_generation(
+        langfuse_generation = langfuse_tracer.start_observation(as_type="generation",
             trace_context=trace_context, name="chat", model=llm_model_config["llm_name"],
             input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg}
         )

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether `decorate_answer` directly closes over `langfuse_generation`.
python - <<'PY'
import ast
from pathlib import Path

path = Path("api/db/services/dialog_service.py")
tree = ast.parse(path.read_text())

async_chat = next(
    node for node in tree.body
    if isinstance(node, ast.AsyncFunctionDef) and node.name == "async_chat"
)
decorate_answer = next(
    node for node in async_chat.body
    if isinstance(node, ast.FunctionDef) and node.name == "decorate_answer"
)

names = {n.id for n in ast.walk(decorate_answer) if isinstance(n, ast.Name)}
string_check = any(
    isinstance(n, ast.Constant) and n.value == "langfuse_generation"
    for n in ast.walk(decorate_answer)
)

print("direct_name_reference =", "langfuse_generation" in names)
print("string_check_present =", string_check)
PY

Repository: infiniflow/ragflow

Length of output: 118


🏁 Script executed:

#!/bin/bash
# First, verify the file exists and check the relevant lines
head -n 800 api/db/services/dialog_service.py | tail -n 50

Repository: infiniflow/ragflow

Length of output: 2836


🏁 Script executed:

#!/bin/bash
# Search for the cleanup path that checks langfuse_generation
rg -n "langfuse_generation" api/db/services/dialog_service.py -A 2 -B 2

Repository: infiniflow/ragflow

Length of output: 1006


Fix the langfuse observation finalization guard condition.

The cleanup path at line 771 uses "langfuse_generation" in locals(), which is a fragile guard even though the variable is referenced directly on lines 774–775. Inside a nested function, locals() reports an enclosing-scope variable only when the nested function closes over it and it has already been bound, so the check depends on scoping details rather than on whether the observation was created. Initialize the variable and test it directly instead:

Suggested fix
     langfuse_tracer = None
+    langfuse_generation = None
     trace_context = {}
@@
-        if langfuse_tracer and "langfuse_generation" in locals():
+        if langfuse_generation is not None:
             langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
             langfuse_output = {"time_elapsed:": re.sub(r"\n", "  \n", langfuse_output), "created_at": time.time()}
             langfuse_generation.update(output=langfuse_output)
             langfuse_generation.end()
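
The scoping behavior behind this suggestion can be reproduced in isolation. The sketch below is illustrative only: `FakeGeneration`, `string_guard_without_closure`, and `run_chat` are hypothetical names, not code from dialog_service.py. It shows that a string lookup in locals() misses an enclosing variable the nested function never references by name, and that initializing the variable to None makes a direct test reliable.

```python
from typing import List, Optional


class FakeGeneration:
    """Hypothetical stand-in that records the calls made on it."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def update(self, output: str) -> None:
        self.events.append(f"update:{output}")

    def end(self) -> None:
        self.events.append("end")


def string_guard_without_closure() -> bool:
    """Show that locals() in a nested function can miss an outer variable."""
    langfuse_generation = object()  # bound in the enclosing scope

    def nested() -> bool:
        # The name is never referenced directly here, so it is not a free
        # variable of nested() and does not appear in its locals().
        return "langfuse_generation" in locals()

    return nested()


def run_chat(tracing_enabled: bool) -> List[str]:
    """Initialize the variable to None and test it directly in cleanup."""
    events: List[str] = []
    generation: Optional[FakeGeneration] = None  # always defined

    def decorate_answer() -> None:
        # Direct test: works whether or not tracing was enabled.
        if generation is not None:
            generation.update(output="done")
            generation.end()

    if tracing_enabled:
        generation = FakeGeneration(events)

    decorate_answer()
    return events


guard_result = string_guard_without_closure()
traced = run_chat(tracing_enabled=True)
untraced = run_chat(tracing_enabled=False)
```

Because `decorate_answer` reads the enclosing variable when it is called, it sees the observation even though the assignment happens after the nested function is defined.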

@RazmikGevorgyan RazmikGevorgyan force-pushed the fix/langfuse-v4-compat branch 2 times, most recently from f82a1d7 to 57f2c6f Compare April 18, 2026 09:09
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 18, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
api/db/services/dialog_service.py (1)

779-783: ⚠️ Potential issue | 🟠 Major

Finalize guard still prevents observation close in decorate_answer.

langfuse_generation is created here, but line 771 checks "langfuse_generation" in locals() inside the nested function. That can skip .update()/.end() even when the observation exists, leaving traces incomplete.

Suggested fix
-    langfuse_tracer = None
+    langfuse_tracer = None
+    langfuse_generation = None
     trace_context = {}
@@
-        if langfuse_tracer and "langfuse_generation" in locals():
+        if langfuse_generation is not None:
             langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
             langfuse_output = {"time_elapsed:": re.sub(r"\n", "  \n", langfuse_output), "created_at": time.time()}
             langfuse_generation.update(output=langfuse_output)
             langfuse_generation.end()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/db/services/dialog_service.py` around lines 779 - 783, The guard using
"langfuse_generation" in locals() inside decorate_answer can miss an existing
observation and skip .update()/.end(); to fix, initialize langfuse_generation =
None in the outer scope before the if langfuse_tracer block so the name is
always defined, then in the nested cleanup/close function check "if
langfuse_generation is not None" (or truthy) and call
langfuse_generation.update(...) / langfuse_generation.end(); if the nested
function needs to assign to langfuse_generation, add a nonlocal
langfuse_generation declaration to allow modifying the outer variable.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 00679a3d-4f00-4181-8081-aef9f0163f9f

📥 Commits

Reviewing files that changed from the base of the PR and between 5b75616 and f82a1d7.

📒 Files selected for processing (2)
  • api/db/services/dialog_service.py
  • api/db/services/llm_service.py
✅ Files skipped from review due to trivial changes (1)
  • api/db/services/llm_service.py

@yingfeng yingfeng requested a review from Lynn-Inf April 20, 2026 11:27
Contributor

@Lynn-Inf Lynn-Inf left a comment

Thanks for the PR! I agree with the changes.

One additional thing: besides updating the code to work with the v4 SDK, please also update the version constraints in pyproject.toml to match the changes in uv.lock. I see that uv.lock has already been updated to 4.0.1, so please update pyproject.toml accordingly as well.
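
To see why the declared constraint matters, here is a rough sketch of a minimum-version comparison. `parse_version` and `meets_minimum` are hypothetical helpers that handle only plain dotted numeric versions (no pre-release or local suffixes); the version strings are the old and new floors mentioned in this PR.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted numeric version such as '4.0.1' into integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# A release allowed by the old floor does not satisfy the new one.
old_floor_ok = meets_minimum("2.60.0", "4.0.1")
new_floor_ok = meets_minimum("4.0.1", "4.0.1")
```

With the old floor still declared in pyproject.toml, a resolver could in principle select a release that does not match the migrated call sites, which is presumably why the reviewer asks for the two files to agree.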

@Wbebey

Wbebey commented Apr 23, 2026

@RazmikGevorgyan, merging this PR could unblock me and many others 🙏🏿

@RazmikGevorgyan
Contributor Author

RazmikGevorgyan commented Apr 23, 2026

pushed fix, pls merge if it looks ok

@wangq8 wangq8 merged commit c41b5e8 into infiniflow:main Apr 24, 2026
2 checks passed

Labels

🐞 bug Something isn't working, pull request that fix bug. size:S This PR changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug]: Langfuse integration broken in latest Docker image - 'start_generation' AttributeError

4 participants